The Mission of TextGrid
The aim of TextGrid is to create a virtual research environment (VRE) for humanities scholars that includes various tools and services for the analysis, evaluation and publication of cultural artefacts such as texts, codices and images. A basic idea was to bring together instruments for dealing with texts, because texts are central to the humanities. They are not only media for communicating content or research results, but also primary sources or base material for further studies. The architecture of the workspace is therefore designed primarily for working with texts. TextGrid also supports the storage and re-use of electronic text data. It offers scholars the opportunity to analyze, evaluate and publish texts and image scans and to establish connections between these media. That makes the research environment useful for representatives of several fields: linguistics, literary studies, musicology and visual culture. At the same time, software developers can participate because the architecture of TextGrid is extensible: it has interfaces through which additional tools and services can be added gradually.
The idea behind TextGrid is quite simple: the researcher's real-world writing desk is replaced by the TextGrid Laboratory (Lab), open source software that is available as a portable file. It is based on Eclipse, can be stored on a USB flash drive and can thus be started from any computer. The academic library is represented by the TextGrid Repository (Rep), which is accessible via the internet and enables long-term storage and re-use of research data. The focal point of the Lab is the XML Editor for processing text data. The markup language XML suits the research environment well because all sorts of structured information can be represented and managed in a standardized form. Furthermore, XML is extensible and adaptable to special needs. All in all, TextGrid addresses three main groups of users: scholars using TextGrid for their projects, developers implementing services and tools in TextGrid, and content providers (archives etc.) that want to integrate their data into TextGrid.
Why Use TextGrid?
What benefits can TextGrid offer its users? The virtual research environment is geared mainly to the needs of scholars who want to work within a community of geographically dispersed researchers. Many projects in the humanities not only bring together an interdisciplinary team of scholars with different points of view; their research topics are also increasingly handled by partners who live and work in different countries, which makes working practice more and more complex. In such cases, appropriate user and document management for electronic data processing is useful. Scholars in a dispersed project team require standardized working processes. They want to exchange their research data smoothly and at short notice, and to keep it available in the long term in a safe place, even on the World Wide Web. Working instruments should be flexible and adaptable to various research methods. Scholarly editors in particular often need to share large amounts of archived text data in order to edit it simultaneously. To facilitate the workflow, it is useful for them to have free access to their data as well as a common workspace in which they can use the same tools for efficient text retrieval or annotation under a common graphical user interface. In addition, it is helpful to document intermediate states of the editing process so that other users can trace them.
A virtual research environment meets these needs. It enables new forms of internet-based collaboration. The following requirements are fulfilled:
Administration and organisation of workflow
This comprises
- sophisticated user management, i.e. the distribution of roles and rights among project partners.
- efficient document management, i.e. TextGrid manages projects and objects and the relations between them.
- management of the working process, i.e. interim states of editing can be saved as revisions, and a locking mechanism is available.
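The last point, saving interim states as revisions while a lock prevents conflicting edits, can be pictured with a minimal Python sketch. It is not TextGrid code: the class EditableObject, its methods and the file name are hypothetical names chosen for illustration, and the actual locking and revision mechanisms of TextGrid may work differently.

```python
from dataclasses import dataclass, field
from typing import Optional


class LockedError(Exception):
    """Raised when a user tries to edit an object locked by someone else."""


@dataclass
class EditableObject:
    """Hypothetical stand-in for an object with revisions and a lock."""
    title: str
    revisions: list = field(default_factory=list)  # interim states, oldest first
    locked_by: Optional[str] = None                # user currently holding the lock

    def acquire_lock(self, user: str) -> None:
        # Only one user at a time may hold the lock.
        if self.locked_by not in (None, user):
            raise LockedError(f"{self.title} is locked by {self.locked_by}")
        self.locked_by = user

    def release_lock(self, user: str) -> None:
        if self.locked_by == user:
            self.locked_by = None

    def save_revision(self, user: str, content: str) -> int:
        """Store an interim state and return its revision number (from 0)."""
        if self.locked_by != user:
            raise LockedError(f"{user} does not hold the lock on {self.title}")
        self.revisions.append(content)
        return len(self.revisions) - 1


doc = EditableObject("letter-17.xml")
doc.acquire_lock("alice")
first = doc.save_revision("alice", "<text>draft</text>")
second = doc.save_revision("alice", "<text>draft, corrected</text>")
doc.release_lock("alice")
```

Earlier states remain in the revisions list, so another user can later trace how the text developed.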
Decentralized working
Users can start the Laboratory from a portable storage medium, independent of their location.
Standardization
Working with controlled metadata vocabularies and open standards facilitates data exchange, text retrieval and digital archiving.
Expandability
Open interfaces make the whole system flexible and allow new tools and services to be added.
Modularity
The TextGrid Laboratory has a modular structure with the XML Editor as the core element on which the VRE is built. A large number of tools and services are associated with the XML Editor. This modular structure is intended to facilitate the working process in the virtual research environment. On the one hand, many tools can be arranged and combined according to one's own requirements, as in a construction kit. On the other hand, open interfaces allow the integration of new, specially developed applications. Some tools allow elements to be inserted into the source text in the XML Editor; others have a structural connection to the XML Editor.
Assignment of User Rights
As mentioned above, TextGrid provides sophisticated user rights management. Each TextGrid Project has at least one Project Manager who has the right to delegate, i.e. to assign roles to and remove roles from other users. With this Role Based Access Control (RBAC), roles can also be assigned temporarily, e.g. so that a user can join a project to perform a specific task. Only users holding a role in a given Project can access its unpublished content, and only in the way the role grants: Observers may only read its documents, whereas Editors additionally have the right to modify TextGrid Objects and save their changes. In this way the Project Manager(s) of a TextGrid Project can control access to their data and thus ensure reliable working and storage conditions in a research infrastructure that is accessible worldwide.
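The role model described above can be illustrated with a minimal Python sketch. It is not the TextGrid implementation: the class Project, the PERMISSIONS table and the operation names "read", "write" and "delegate" are assumptions made for illustration. In particular, the text does not say whether a Project Manager may also read or write, so the sketch grants that role the delegate right only.

```python
from enum import Enum


class Role(Enum):
    OBSERVER = "Observer"
    EDITOR = "Editor"
    PROJECT_MANAGER = "Project Manager"


# Operations each role grants on unpublished content (illustrative mapping).
PERMISSIONS = {
    Role.OBSERVER: {"read"},
    Role.EDITOR: {"read", "write"},
    Role.PROJECT_MANAGER: {"delegate"},  # assumption: delegation only
}


class Project:
    """Hypothetical project with per-user role sets."""

    def __init__(self, name: str, manager: str) -> None:
        self.name = name
        # Every project starts with one Project Manager.
        self.roles = {manager: {Role.PROJECT_MANAGER}}

    def allowed(self, user: str, operation: str) -> bool:
        # Users without any role in the project get no access at all.
        return any(operation in PERMISSIONS[role]
                   for role in self.roles.get(user, set()))

    def assign(self, manager: str, user: str, role: Role) -> None:
        if not self.allowed(manager, "delegate"):
            raise PermissionError(f"{manager} may not delegate in {self.name}")
        self.roles.setdefault(user, set()).add(role)

    def remove(self, manager: str, user: str, role: Role) -> None:
        if not self.allowed(manager, "delegate"):
            raise PermissionError(f"{manager} may not delegate in {self.name}")
        self.roles.get(user, set()).discard(role)


project = Project("Letters Edition", manager="alice")
project.assign("alice", "bob", Role.EDITOR)
project.assign("alice", "carol", Role.OBSERVER)
# Temporary participation: grant a role for a task, then withdraw it.
project.remove("alice", "carol", Role.OBSERVER)
```

The sketch does not enforce that at least one Project Manager always remains; a real system would have to guard against removing the last one.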
Distributed Storage
The TextGrid Repository has a distributed storage which consists of two main parts. The activation, creation and editing of files takes place in the TextGrid Repository. The data is stored and secured in a search index and has no persistent identifier. This so-called Search Index 1 is dynamic, which means it contains data that can be accessed only by Lab users who have authenticated and have been assigned the appropriate rights. This data can be changed and deleted by users who have the authority to edit and delete respectively. Other users can be invited to take part in this process. After publication the data is transferred to the second search index. The data disappears from Search Index 1 and is provided with a persistent identifier (a code consisting of numbers and letters) so that it can be referenced uniquely. Files transferred in this way cannot be changed anymore. Afterwards, modifications can be made only by copying a file and re-importing it into Search Index 1.
Search Index 2 is accessible to everyone on the World Wide Web. It is static, which means its data cannot be changed. Large amounts of data can be uploaded from other archives or publication servers via a special interface. Search Index 2 can be browsed by Lab users and web users alike. After a search with the Search Tool in the TextGridLab, the user gets a list of results. These search results accurately reflect the TextGridRep inventory. They are presented in exactly the same order as the results of a search submitted on the TextGridRep website.
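The life cycle of an object across the two search indexes can be summarized in a short Python sketch. This is a simplified model under stated assumptions, not the TextGrid storage layer: the class TwoIndexStore, its method names and the identifier formats "work-N" and "pid-N" are invented for illustration, and real persistent identifiers are issued differently. Access rights are left out.

```python
import itertools


class TwoIndexStore:
    """Hypothetical model of the dynamic and the static search index."""

    def __init__(self) -> None:
        self.dynamic = {}  # "Search Index 1": working ID -> content, editable
        self.public = {}   # "Search Index 2": persistent ID -> content, frozen
        self._counter = itertools.count(1)

    def create(self, content: str) -> str:
        working_id = f"work-{next(self._counter)}"
        self.dynamic[working_id] = content
        return working_id

    def edit(self, object_id: str, content: str) -> None:
        if object_id in self.public:
            raise PermissionError("published objects cannot be changed")
        if object_id not in self.dynamic:
            raise KeyError(object_id)
        self.dynamic[object_id] = content

    def publish(self, working_id: str) -> str:
        # The object leaves the dynamic index and receives a persistent ID.
        content = self.dynamic.pop(working_id)
        persistent_id = f"pid-{next(self._counter)}"  # placeholder identifier
        self.public[persistent_id] = content
        return persistent_id

    def copy_for_revision(self, persistent_id: str) -> str:
        # Further changes require a copy re-imported into the dynamic index.
        return self.create(self.public[persistent_id])


store = TwoIndexStore()
draft_id = store.create("<text>draft</text>")
store.edit(draft_id, "<text>final</text>")
pid = store.publish(draft_id)
copy_id = store.copy_for_revision(pid)
store.edit(copy_id, "<text>second edition</text>")
```

The published object stays untouched under its persistent identifier while the copy is edited under a new working ID.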